BigDFT.RemoteRunnerUtils module

class RemoteFunction(submitter, name, **kwargs)[source]

Serialize and execute remotely a python function.

With this class we serialize and execute a python function on a remote filesystem. This filesystem may be a remote computer or a different directory of the local machine.

This class is useful in runners that need to be executed outside of the current framework, but whose result is needed to continue the processing of the data.

Parameters
  • name (str) – the name of the remote function

  • function (func) – the python function to be serialized. The return value of the function should be an object for which a relatively light-weight serialization is possible.

  • submitter (str) – the interpreter to be invoked. Should be, e.g. python if the function is a python function, or bash.

  • **kwargs – keyword arguments of the function. The arguments of the function should be serializable without the requirement of too much disk space.
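
As a rough illustration of the idea described above, the following standard-library sketch serializes the keyword arguments of a function into a separate directory, calls the function with the reloaded arguments, and reads the stored result back. It does not use BigDFT: `run_in_directory`, `args.json` and `result.json` are hypothetical names chosen for this example, and JSON merely stands in for whichever serializer a concrete subclass employs.

```python
import json
import os
import tempfile


def run_in_directory(function, directory, **kwargs):
    """Exchange arguments and result of `function` through files in `directory`."""
    args_file = os.path.join(directory, "args.json")      # hypothetical name
    result_file = os.path.join(directory, "result.json")  # hypothetical name

    # Step 1: serialize the keyword arguments into the target directory.
    with open(args_file, "w") as ofile:
        json.dump(kwargs, ofile)

    # Step 2: what the target side would do: reload the arguments,
    # call the function and store its (lightweight) return value.
    with open(args_file) as ifile:
        loaded = json.load(ifile)
    with open(result_file, "w") as ofile:
        json.dump(function(**loaded), ofile)

    # Step 3: fetch the result back.
    with open(result_file) as ifile:
        return json.load(ifile)


def scaled_sum(values, factor=1):
    return factor * sum(values)


with tempfile.TemporaryDirectory() as workdir:
    result = run_in_directory(scaled_sum, workdir, values=[1, 2, 3], factor=2)
print(result)
```

The sketch also shows why both the arguments and the return value should be cheap to serialize: each of them crosses the filesystem boundary once.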

files_sent(yes)[source]

Mark the relevant files as sent.

Parameters

yes (bool) – True if the files have been already sent to the remote directory

files_received(yes)[source]

Mark the relevant files as received.

Parameters

yes (bool) – True if the files have already been received from the remote directory

directories_created(yes)[source]

Mark the relevant directories as created.

Parameters

yes (bool) – True if the directories have been tested as present.

prepare_files_to_send()[source]

List of files that will have to be sent.

This function defines the files that need to be sent in order to execute the python function remotely. It also sets the main file to be called and the resultfile.

Returns

files which will be sent

Return type

dict

prepare_files_to_receive()[source]

List of files that will have to be received.

This function defines the files that need to be retrieved from the url as part of the remote execution of the python function.

Returns

files which will be received

Return type

dict

setup(dest='.', src='/tmp/')[source]

Create and list the files which have to be sent remotely.

Parameters
  • dest (str) – remote_directory of the destination filesystem. The directory should exist and write permission should be granted.

  • src (str) – local directory to prepare the IO files to send. The directory should exist and write permission should be granted.

Returns

the list of files

Return type

list

send_files(files)[source]

Send files to the remote filesystem on which the class should run.

With this function, the relevant serialized files should be sent, via the rsync protocol (or equivalent) into the remote directory which will then be employed to run the function.

Parameters

files (dict) – source and destination files to be sent, organized as items of a dictionary.
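
A minimal sketch of such a transfer, assuming that each key of the dictionary is a source path and each value the corresponding destination path (the direction of the mapping is an assumption of this example). A plain local copy with `shutil` stands in for rsync, and `copy_files` is a hypothetical helper, not part of BigDFT.

```python
import os
import shutil
import tempfile


def copy_files(files):
    """Copy each source file to its destination; `files` maps source -> destination."""
    for source, destination in files.items():
        # Make sure the destination directory exists before copying.
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        shutil.copy(source, destination)


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dest:
    local_file = os.path.join(src, "run.py")
    with open(local_file, "w") as ofile:
        ofile.write("print('hello')\n")
    remote_file = os.path.join(dest, "remote_dir", "run.py")
    copy_files({local_file: remote_file})
    sent_ok = os.path.isfile(remote_file)
print(sent_ok)
```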

append_function(extra_function)[source]

Include the call to another function in the main runfile.

With this method another remote function can be called in the same context of the remote call. Such remote function will be called from within the same remote directory of the overall function. The result of the remote function will then be the result provided from the last of the appended functions.

Parameters

extra_function (RemoteFunction) – the remote function to append.

send_to(dest='.', src='/tmp/')[source]

Assign the remote filesystem on which the class should run.

With this function, the relevant serialized files should be sent, via the rsync protocol (or equivalent) into the remote directory which will then be employed to run the function.

Parameters
  • dest (str) – remote_directory of the destination filesystem. The directory should exist and write permission should be granted.

  • src (str) – local directory to prepare the IO files to send. The directory should exist and write permission should be granted.

call(remotely=True, remote_cwd=False)[source]

Provides the command to execute the function.

Parameters
  • remotely (bool) – invoke the function from the local machine, employing the ssh protocol to perform the call.

  • remote_cwd (bool) – True if our CWD is already the remote_dir to which the function has been sent. False otherwise. Always assumed as False if remotely is True.

Returns

the string command to be executed.

Return type

str
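
To make the interplay of the two flags concrete, here is a hypothetical sketch of how such a command string could be assembled. The helper `build_command`, its `host` argument and the exact layout of the resulting string are assumptions of this example; the string actually produced by call() may differ.

```python
def build_command(submitter, runfile, remote_dir, host=None,
                  remotely=True, remote_cwd=False):
    """Assemble a command string (hypothetical layout, for illustration only)."""
    command = submitter + " " + runfile
    if remotely:
        # remote_cwd is always assumed False when the call is made remotely.
        remote_cwd = False
    if not remote_cwd:
        # We are not yet in the remote directory: move there first.
        command = "cd " + remote_dir + " && " + command
    if remotely and host is not None:
        # Wrap the whole command in an ssh invocation.
        command = "ssh " + host + ' "' + command + '"'
    return command


remote_command = build_command("python", "run.py", "/scratch/job",
                               host="user@cluster")
local_command = build_command("bash", "run.sh", "/scratch/job",
                              remotely=False, remote_cwd=True)
print(remote_command)
print(local_command)
```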

is_finished(remotely=True, anyfile=True, timeout=-1, verbose=None)[source]

Check if the function has been executed.

This is controlled by the presence of the resultfile. This runfile has a set name that is generated at initialisation.

Parameters
  • remotely (bool) – control the presence of the result on the remote filesystem

  • anyfile (bool) – determine if any run is finished. Useful if the run in question is asynchronous

  • timeout (int) – maximum number of times is_finished can be called before the check times out; this prevents infinite loops. Set to -1 to disable the check, or to a positive integer to limit the check to that many calls.

  • verbose (bool) – local verbosity for this check

Returns

True if ready, False otherwise.

Return type

bool

is_finished will return the status of the run it is called for. Checks can be made remotely for checking if a remote run has finished, or locally, to check whether the files from a remote run have been transferred back to the local system, or to check if a local run has completed.

Advanced arguments such as anyfile and timeout can greatly enhance the capabilities of these checks.

Anyfile

The anyfile argument states whether is_finished() cares about any file that matches the resultfile name for the run, or should check that the file has been created after the run file was created.

Simply put, add anyfile=False to a call to ensure that the results returned from the remote are due to a recent run, and not to a previous run whose files have not been cleaned up.
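
The distinction between the two modes can be sketched by comparing file modification times. This is only an illustration under the assumption that "created after the run file" is decided from timestamps; `result_is_ready` is a hypothetical helper and not the BigDFT implementation.

```python
import os
import tempfile


def result_is_ready(resultfile, runfile, anyfile=True):
    """Return True if a usable result file is present."""
    if not os.path.exists(resultfile):
        return False
    if anyfile:
        # Any file with the expected name is accepted.
        return True
    # Otherwise accept only a result at least as recent as the run file.
    return os.path.getmtime(resultfile) >= os.path.getmtime(runfile)


with tempfile.TemporaryDirectory() as workdir:
    runfile = os.path.join(workdir, "run.py")
    resultfile = os.path.join(workdir, "result.json")
    for path in (runfile, resultfile):
        open(path, "w").close()
    # Pretend the result is a leftover from an older run.
    os.utime(resultfile, (1000, 1000))
    os.utime(runfile, (2000, 2000))
    stale_any = result_is_ready(resultfile, runfile, anyfile=True)
    stale_strict = result_is_ready(resultfile, runfile, anyfile=False)
print(stale_any, stale_strict)
```

With the stale result file in place, the permissive check reports the run as finished while the strict check does not.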

Timeout

The timeout argument allows is_finished to fail timeout times before returning False. This is useful for ensuring that a while not run.is_finished() loop does not run forever, while still leaving allowance for waiting for a run to start.
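
The kind of bounded polling loop this argument is meant to support can be sketched as follows. How is_finished signals a time-out internally is not specified here, so the sketch keeps the counter on the caller's side; `FakeRun` and `wait_for` are hypothetical stand-ins, not BigDFT objects.

```python
class FakeRun:
    """Hypothetical stand-in that reports completion after a few checks."""

    def __init__(self, checks_needed):
        self.checks_needed = checks_needed
        self.checks_done = 0

    def is_finished(self):
        self.checks_done += 1
        return self.checks_done >= self.checks_needed


def wait_for(run, max_checks):
    """Poll `run` until it finishes; give up after `max_checks` failures (-1: never)."""
    attempts = 0
    while not run.is_finished():
        attempts += 1
        if max_checks != -1 and attempts >= max_checks:
            return False  # the limit was reached, stop waiting
        # A real loop would sleep here between two checks.
    return True


finished_quick = wait_for(FakeRun(checks_needed=3), max_checks=5)
finished_slow = wait_for(FakeRun(checks_needed=10), max_checks=5)
print(finished_quick, finished_slow)
```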

Final Notes

Verbosity can be explicitly set for each call, allowing a run where the global verbosity is False to provide info on the run states, or vice versa.

The remotely argument can also be given, to force a run to check only locally (or remotely).

receive_files(files)[source]

Fetch files back from the run directory as needed. Mark them as received once done.

fetch_result(remotely=True)[source]

Get the results of a calculation locally.

Parameters

remotely (bool) – control the presence of the result on the remote filesystem

Returns

The object returned by the original function

class RemoteJSONPickleFunction(submitter, name, function, required_files=None, output_files=None, **kwargs)[source]

Serialize and execute remotely a python function, serialized with json pickle.

With this class we serialize and execute a python function on a remote filesystem. This filesystem may be a remote computer or a different directory of the local machine.

This class is useful in runners that need to be executed outside of the current framework, but whose result is needed to continue the processing of the data.

Parameters
  • name (str) – the name of the remote function

  • function (func) – the python function to be serialized. The return value of the function should be an object for which a relatively light-weight serialization is possible.

  • submitter (str) – the interpreter to be invoked. Should be, e.g. python if the function is a python function, or bash.

  • required_files (list) – list of extra files that may be required for the function to run correctly.

  • output_files (list) – list of the files that the function will produce that are supposed to be retrieved to the host computer.

  • **kwargs – keyword arguments of the function. The arguments of the function should be serializable without the requirement of too much disk space.

append_function(extra_function)[source]

class RemoteDillFunction(submitter, name, function, required_files=None, output_files=None, **kwargs)[source]

Serialize and execute remotely a python function, serialized with dill.

With this class we serialize and execute a python function on a remote filesystem. This filesystem may be a remote computer or a different directory of the local machine.

This class is useful in runners that need to be executed outside of the current framework, but whose result is needed to continue the processing of the data.

Parameters
  • name (str) – the name of the remote function

  • function (func) – the python function to be serialized. The return value of the function should be an object for which a relatively light-weight serialization is possible.

  • submitter (str) – the interpreter to be invoked. Should be, e.g. python if the function is a python function, or bash.

  • required_files (list) – list of extra files that may be required for the function to run correctly.

  • output_files (list) – list of the files that the function will produce that are supposed to be retrieved to the host computer.

  • **kwargs – keyword arguments of the function. The arguments of the function should be serializable without the requirement of too much disk space.

append_function(extra_function)[source]

class RemoteJSONFunction(submitter, name, function, extra_encoder_functions=None, required_files=None, output_files=None, **kwargs)[source]

Serialize and execute remotely a python function, serialized with JSON.

With this class we serialize and execute a python function on a remote filesystem. This filesystem may be a remote computer or a different directory of the local machine.

This class is useful in runners that need to be executed outside of the current framework, but whose result is needed locally to continue data processing.

Parameters
  • name (str) – the name of the remote function

  • function (func) – the python function to be serialized. The return value of the function should be an object for which a relatively light-weight serialization is possible.

  • submitter (str) – the interpreter to be invoked. Should be, e.g. python if the function is a python function, or bash.

  • extra_encoder_functions (list) – list of dictionaries of the format {'cls': Class, 'func': function}, which are employed to serialize non-intrinsic objects as well as non-numpy objects.

  • required_files (list) – list of extra files that may be required for the function to run correctly.

  • output_files (list) – list of the files that the function will produce that are supposed to be retrieved to the host computer.

  • **kwargs – keyword arguments of the function. The arguments of the function should be serializable without the requirement of too much disk space.
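
The {'cls': Class, 'func': function} format of extra_encoder_functions can be illustrated with the standard `json` module. The sketch below is an assumption about how such a list might be consumed, not the BigDFT implementation: `make_encoder` and `Point` are hypothetical names, and each 'func' is assumed to turn an instance of 'cls' into something JSON can already serialize.

```python
import json


class Point:
    """Hypothetical user-defined class that json cannot serialize natively."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def make_encoder(extra_encoder_functions):
    """Build a JSONEncoder subclass that consults the given cls/func pairs."""

    class CustomEncoder(json.JSONEncoder):
        def default(self, obj):
            # Try each registered class in turn.
            for spec in extra_encoder_functions:
                if isinstance(obj, spec['cls']):
                    return spec['func'](obj)
            # Fall back to the standard behaviour, which raises TypeError.
            return super().default(obj)

    return CustomEncoder


extra = [{'cls': Point, 'func': lambda p: {'x': p.x, 'y': p.y}}]
encoded = json.dumps({'origin': Point(0, 1)}, cls=make_encoder(extra))
print(encoded)
```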

append_function(extra_function)[source]

class RemoteScript(submitter, name, script, result_file, output_files, **kwargs)[source]

Triggers the remote execution of a script.

This class is useful to execute a script remotely and to retrieve the results of such execution. It inherits from the RemoteFunction base class and extends some of its actions to the concept of the script.

Parameters
  • name (str) – the name of the remote function

  • script (str, func) – The script to be executed provided in string form. It can also be provided as a function which returns a string.

  • result_file (str) – the name of the file to which the script should redirect its output.

  • submitter (str) – the interpreter to be invoked. Should be, e.g. bash if the script is a shell script, or ‘qsub’ if this is a submission script.

  • output_files (list) – list of the files that the function will produce that are supposed to be retrieved to the host computer.

  • **kwargs – keyword arguments of the script function, which will be substituted in the string representation.
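
The two accepted forms of the script argument can be sketched as below. The substitution mechanism is an assumption of this example: a function is simply called with the keyword arguments, and a plain string is filled in with `str.format`. `render_script` and `job_script` are hypothetical helpers, not BigDFT functions.

```python
def render_script(script, **kwargs):
    """Return the script text, calling `script` with kwargs if it is a function."""
    if callable(script):
        return script(**kwargs)
    # Assumption: plain strings carry str.format placeholders.
    return script.format(**kwargs)


def job_script(nodes, command):
    """Hypothetical submission script, redirecting the output to result.out."""
    return "#!/bin/bash\n#PBS -l nodes={}\n{} > result.out\n".format(nodes, command)


from_function = render_script(job_script, nodes=2, command="echo done")
from_string = render_script("echo {message} > {result_file}\n",
                            message="hello", result_file="result.out")
print(from_function)
print(from_string)
```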

class CallableAttrDict(*args, **kwargs)[source]

Dict-form structure where the contents are accessible as attributes

Example

>>> d = CallableAttrDict({'a': 1})
>>> print(d.a)
1
>>> print(d())
"{'a': 1}"
>>> d.a = 2
>>> print(d())
"{'a': 2}"
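
For readers curious how a structure of this kind can be built, here is a minimal stand-in written from scratch. It is not the BigDFT class: `AttrDictSketch` is a hypothetical name, and the choice of returning the string form of the dictionary when the object is called is an assumption of this sketch.

```python
class AttrDictSketch(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Store attributes as dictionary items.
        self[name] = value

    def __call__(self):
        # Assumption: calling the object gives its string representation.
        return str(dict(self))


d = AttrDictSketch({'a': 1})
first = d.a
d.a = 2
second = d()
print(first, second)
```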